Work through the following 18 R problems. This assignment is worth 8 points (4/9 points per problem). You should also complete Assignment 3.2P OR Assignment 3.2R for an additional 2 points. Together, Assignment 3.1 and your choice of either 3.2P (Python) or 3.2R (R challenge) are worth 10 points and 15% of your final course grade (just like the other assignments). You must work individually, but feel free to ask questions in class and on Slack!
You must submit the problems as an R script (.R file). All answers must have comments and be labeled with the problem number. Some answers will only be comments without associated R code. For questions about output, report your output in a comment under the relevant line of code. Note that every comment line starts with # so that R does not run it.
This week’s assignment has a lot of exercises that you can tackle in different ways. If you want to use a different approach or package than your classmates, that is fine! We also do not mind which colors you pick for your plots or if font sizes differ slightly. However, always make sure that you practice good, clear, and efficient coding style.
#Q1.1.0
#This is the answer to the example problem Q1.1.0 in a comment
ThisVariable <- 1
ThisVariable
#[1] 1
#Q1.1.1
#Answer
Simulate some data and show them in a boxplot.
Make a scatterplot of the average temperature measured at Schiphol Airport over the last 70 years. The data can be found here: https://bit.ly/3GLVQ86 . Put time on the x-axis and average temperature (TAVG) on the y-axis.
The day that the Titanic sank was a bad day in many ways, most importantly because it helps present-day men justify pro-male sexism with plots like this:
Can you recreate the barplot with the ggplot2 and titanic packages (the titanic_train dataset has the passenger data)? Functions that might be useful are factor() and labs(fill = "my label").
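Hint: the sketch below shows how factor() and labs(fill = ...) work together. It deliberately uses the built-in mtcars data rather than titanic_train, so the variable names, labels, and bar layout here are illustrative assumptions and not the expected answer.

```r
library(ggplot2)

# Illustration on mtcars, NOT the Titanic data.
# 'am' is stored as 0/1 (0 = automatic, 1 = manual), so factor() turns it
# into a labeled categorical variable that ggplot2 can map to fill colors.
p <- ggplot(mtcars,
            aes(x = factor(cyl),
                fill = factor(am, labels = c("automatic", "manual")))) +
  geom_bar(position = "dodge") +   # one bar per group, side by side
  labs(x = "Cylinders",
       y = "Number of cars",
       fill = "Transmission")      # labs(fill = ...) sets the legend title
p
```

Without factor(), ggplot2 would treat a 0/1 column as a continuous variable and the bars would not be split into groups.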
Try out different themes from ggplot2 for the previous plot. You can find ggplot2’s “complete themes” here: https://ggplot2-book.org/polishing.html in section 18.2. Apply the one that you like best.
Improve the visuals of this plot in three ways and briefly list your edits in a comment.
plot(mtcars$cyl, mtcars$hp)
Use the built-in Orange dataset and recreate this ggplot. Make sure the bars are shown in the same order as here. You might need to google how to reorder them (it is a common nuisance). Notice that you have to compute the trees’ maximum circumference before plotting.
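Hint: one common way to control bar order is reorder(), which sorts the levels of a factor by a second variable. The sketch below first summarises and then plots, but it uses the built-in chickwts data and a mean instead of Orange and a maximum, so treat it as an illustration of the technique only.

```r
library(ggplot2)

# Illustration on chickwts, NOT Orange: summarise first, then plot.
feed_means <- aggregate(weight ~ feed, data = chickwts, FUN = mean)

# reorder() sorts the factor levels of 'feed' by the summarised value,
# which determines the left-to-right order of the bars.
p <- ggplot(feed_means, aes(x = reorder(feed, weight), y = weight)) +
  geom_col() +
  labs(x = "Feed", y = "Mean weight (g)")
p
```

Using reorder(feed, -weight) would sort the bars in descending order instead.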
Recreate this plot from the Orange dataset. To prevent likely questions: Yes, the plotted line is based on all the data (i.e., not just a single tree).
Add a second plot next to the plot from exercise Q3.1.6 showing the development of the individual trees’ circumferences over time. An easy way to do it is with the patchwork package but you can also use the par() function.
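Hint: par(mfrow = ...) arranges base R plots in a grid; it has no effect on ggplot2 objects. For ggplot2 objects, patchwork lets you place two stored plots next to each other with p1 + p2. A minimal base R sketch with unrelated built-in data (cars and pressure are stand-ins chosen for illustration):

```r
# Illustration with unrelated built-in data, NOT the Orange plots.
old_par <- par(mfrow = c(1, 2))   # layout: 1 row, 2 columns

plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
plot(pressure$temperature, pressure$pressure, type = "l",
     xlab = "Temperature (deg C)", ylab = "Pressure (mm Hg)")

par(old_par)                      # restore the previous layout
```

Saving the old settings and restoring them afterwards keeps later plots in the script from inheriting the two-column layout.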
Ever wondered if guinea pigs’ teeth grow better when they take their vitamin C through orange juice rather than meds? Make a plot that answers this question with the ggstatsplot package using the built-in dataset ToothGrowth. Hint when checking out the package: this is a BETWEEN subject comparison ;)
Recreate this 3D plot with the plotly package and the built-in iris dataset.
Recreate this animated plot with the gganimate package. You can get the data through the refresh_coronavirus_jhu() function from the package coronavirus. The exact animation speed and other visual settings do not matter for grading.